One of the most widely used methods for community detection in networks is the maximization 
of the quality function known as modularity. Of the many maximization techniques that have been 
used in this context, some of the most conceptually attractive are the spectral methods, which 
are based on the eigenvectors of the modularity matrix. Spectral algorithms have, however, been 
limited by and large to the division of networks into only two or three communities, with divisions 
into more than three being achieved by repeated two-way division. Here we present a spectral 
algorithm that can directly divide a network into any number of communities. The algorithm 
makes use of a mapping from modularity maximization to a vector partitioning problem, combined 
with a fast heuristic for vector partitioning. We compare the performance of this spectral algorithm 
with previous approaches and find it to give superior results, particularly in cases where community 
sizes are unbalanced. We also give demonstrative applications of the algorithm to two real-world 
networks and find that it produces results in good agreement with expectations for the networks 
studied. 


I. INTRODUCTION 

Community detection, the division of the vertices of a network into groups such that
connections are dense within groups and sparser between them, has been a topic of
vigorous research, particularly within statistical physics, for some years [1]. A broad
range of different approaches to the problem has been tried, but perhaps those in widest
current use are methods based on modularity maximization. Modularity [2] is a scalar
objective function which assigns a numerical score to any division of a network into
communities, with higher scores being associated with divisions that are better in the
sense of having more edges within communities and fewer between them. Modularity
maximization discovers good divisions of a network by finding the ones that have the
highest modularity scores. Unfortunately, the exhaustive numerical maximization of
modularity over all divisions of a network is known to be an NP-hard task [3],
computationally tractable only for the very smallest of networks, so we are forced to
rely on approximate optimization heuristics, a large number of which have been tried.
These include greedy algorithms [4, 5], simulated annealing [6, 7], extremal
optimization [8], genetic algorithms [9], and the widely used multiscale “Louvain
method” of Blondel et al. [10], which has been incorporated into a number of common
software packages.

In this paper we focus on another class of algorithms for modularity maximization, the
spectral algorithms, which are based on the examination of the leading eigenvalues and
eigenvectors of the so-called modularity matrix [11]. These methods are of interest for
a number of reasons. First, they give high-quality results in practical situations while
also being fast, the eigenvalues and eigenvectors normally being calculated using the
Lanczos method [12], which is highly efficient for the sparse matrices that typically
arise in network problems. Second, they are conceptually attractive, being based on
well-understood principles of linear algebra. And third, they are, by contrast with most
other approaches, amenable to formal analysis, for instance using random matrix
theory [13], allowing one to make precise statements about their performance.

Spectral methods, however, do have their problems. A primary one is that there is no
simple principled spectral algorithm for dividing a network into an arbitrary number of
communities. Good algorithms exist for two- and three-way divisions, and repeated
two-way divisions can sometimes produce good multiway divisions, but sometimes
not [11, 14, 15]. A better approach, proposed by White and Smyth [16], is to compute
several leading eigenvectors of the modularity matrix at once, represent them as points
in a high-dimensional space, and then cluster those points using a conventional data
clustering method (White and Smyth use k-means). This method, which is analogous to
previous algorithms for the different but related problem of Laplacian spectral graph
partitioning [17, 18], is attractive in that it directly divides a network into the
desired number of communities. On the other hand, while the strong similarity between
graph partitioning and modularity maximization [11, 19] makes it natural to think that
k-means would work in this situation, it is not clear what quantity, if any, the
algorithm of [16] is optimizing. In particular, the algorithm is not derived as an
approximation to modularity maximization, so there are no formal guarantees that it will
indeed maximize modularity, and in practice, as we show in this paper, there are
situations where it can fail badly.

In this paper, therefore, we introduce a different method for single-step, multiway,
spectral community detection. Our method is not a generalization of the previous two-way
method, which is based on a relaxation of the discrete modularity optimization problem
to a continuous optimization that can be solved by differentiation. Instead the method
is based on the observation, made previously in [14], that modularity maximization is
equivalent to a max-sum vector partitioning problem. (A similar equivalence for the
graph partitioning problem was explored in [20, 21].) We propose a simple heuristic for
the rapid solution of vector partitioning problems and apply it to the task at hand to
create an efficient multiway community detection algorithm.


II. SPECTRAL COMMUNITY DETECTION 
AND VECTOR PARTITIONING 


The modularity $Q$ is a score assigned to a given division into any number of
communities of a given network, such that good divisions (those in which most edges fall
within communities and few edges fall between them) get a high score and bad divisions a
low one. Formally, the modularity is equal to the fraction of edges that fall within
communities minus the expected fraction if edges are placed at random. Consider an
undirected, unweighted network of $n$ vertices and define an adjacency matrix
$\mathbf{A}$ to represent the network structure with elements $A_{ij} = 1$ if vertices
$i$ and $j$ are connected by an edge and 0 otherwise. Now consider a division of the
vertices of this network into $k$ non-overlapping groups, labeled by integers
$1 \ldots k$, and define $g_i$ to be the label of the group to which vertex $i$ belongs.
Then the modularity is given by [2]

$$Q = \frac{1}{2m} \sum_{ij} \Bigl[ A_{ij} - \frac{d_i d_j}{2m} \Bigr] \delta_{g_i g_j}, \tag{1}$$

where $d_i$ is the degree of vertex $i$, $m$ is the total number of edges in the network,
and $\delta_{st}$ is the Kronecker delta. The modularity may be either positive or
negative (or zero), with a maximum value of +1. Positive values indicate that the number
of edges within groups is greater than what one would expect by chance, and large
positive values are considered indicative of a good network division.
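As a concrete illustration of the definition of modularity, the following minimal sketch evaluates $Q$ directly from an edge list. The function name `modularity` and the input conventions (each undirected edge listed once, labels stored in a dictionary) are assumptions made for this example, not part of the original method. It uses the fact that the double sum splits into the fraction of within-group edges minus $\sum_s D_s^2 / (2m)^2$, where $D_s$ is the sum of the degrees in group $s$.

```python
from collections import defaultdict


def modularity(edges, labels):
    """Modularity Q of a division of an undirected, unweighted simple graph.

    edges  : iterable of (i, j) pairs, each undirected edge listed once
    labels : dict mapping each vertex to its group label g_i
    """
    edges = list(edges)
    m = len(edges)                      # total number of edges
    degree = defaultdict(int)
    within = 0                          # edges with both ends in the same group
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
        if labels[i] == labels[j]:
            within += 1

    # D_s = sum of degrees in group s; the expected within-group fraction
    # under random placement is sum_s D_s^2 / (2m)^2.
    group_degree = defaultdict(int)
    for v, d in degree.items():
        group_degree[labels[v]] += d
    expected = sum(D * D for D in group_degree.values()) / (4 * m * m)

    return within / m - expected
```

For two triangles joined by a single edge and divided into the two triangles, this gives $Q = 6/7 - 1/2 = 5/14$, while placing all vertices in a single group gives $Q = 0$.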

For convenience we also define the modularity matrix to be the symmetric $n \times n$
matrix $\mathbf{B}$ with elements

$$B_{ij} = A_{ij} - \frac{d_i d_j}{2m}, \tag{2}$$

in terms of which the modularity can be written

$$Q = \frac{1}{2m} \sum_{ij} B_{ij}\, \delta_{g_i g_j}. \tag{3}$$


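Given that $\sum_i A_{ij} = d_j$ and $\sum_i d_i = 2m$, every row and column of the
modularity matrix must sum to zero:

$$\sum_i B_{ij} = \sum_i A_{ij} - \frac{d_j}{2m} \sum_i d_i = d_j - d_j = 0. \tag{4}$$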
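The zero row sums are easy to confirm numerically. The sketch below builds the modularity matrix for a small hypothetical example (two triangles joined by one edge) using dense NumPy arrays; the function name `modularity_matrix` and the example graph are assumptions for illustration only, and a large network would call for sparse storage instead.

```python
import numpy as np


def modularity_matrix(A):
    """Return B with elements B_ij = A_ij - d_i d_j / (2m)."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)          # vertex degrees d_i
    two_m = d.sum()            # the degrees sum to 2m
    return A - np.outer(d, d) / two_m


# Hypothetical example graph: two triangles joined by the edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

B = modularity_matrix(A)
row_sums = B.sum(axis=1)            # every entry should vanish
residual = B @ np.ones(len(A))      # B applied to the uniform vector
```

Both `row_sums` and `residual` vanish to rounding error, which is the statement that the uniform vector is an eigenvector of $\mathbf{B}$ with eigenvalue zero.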




Now consider the problem of dividing a network with $n$ vertices into $k$ communities.
Since good divisions have high modularity scores and bad divisions low scores, we can
find good divisions by maximizing modularity over divisions. Exact maximization is known
to be very slow [3], so we turn instead to approximate methods. Following [14, 21], we
note that the delta function in Eq. (3) can be written as

$$\delta_{g_i g_j} = \sum_{s=1}^{k} \delta_{s,g_i}\, \delta_{s,g_j}, \tag{5}$$

and since the modularity matrix is symmetric it can always be written as an eigenvector
decomposition

$$B_{ij} = \sum_{l=1}^{n} \lambda_l U_{il} U_{jl}, \tag{6}$$

where $\lambda_l$ is an eigenvalue of $\mathbf{B}$ and $U_{il}$ is an element of the
orthogonal matrix $\mathbf{U}$ whose columns are the corresponding eigenvectors. Without
loss of generality, we will assume that the eigenvalues are numbered in decreasing
order: $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. Combining Eqs. (3), (5),
and (6), we now have

$$Q = \frac{1}{2m} \sum_{l=1}^{n} \lambda_l \sum_{s=1}^{k} \Bigl[ \sum_{i=1}^{n} U_{il}\, \delta_{s,g_i} \Bigr]^2. \tag{7}$$
We observe that (apart from the uninteresting leading constant) this is a sum over
eigenvalues $\lambda_l$ times the nonnegative quantities
$\bigl[\sum_i U_{il}\, \delta_{s,g_i}\bigr]^2$, so the largest (most positive)
contributions to the modularity are typically made by the terms corresponding to the
most positive eigenvalues. A standard approximation, used in essentially all spectral
algorithms, is, instead of maximizing the entire sum, to maximize only these largest
terms, neglecting the others. That is, we approximate the modularity by

$$Q \simeq \frac{1}{2m} \sum_{l=1}^{p} \lambda_l \sum_{s=1}^{k} \Bigl[ \sum_{i=1}^{n} U_{il}\, \delta_{s,g_i} \Bigr]^2 \tag{8}$$

for some integer $p < n$. At a minimum, we maximize only those terms corresponding to
positive values of $\lambda_l$. (Maximizing ones corresponding to negative $\lambda_l$
would reduce, not increase, the modularity.) In effect, we are making a rank-$p$
approximation to the modularity matrix, based on its leading $p$ eigenvectors, then
calculating the modularity using that approximation rather than the true modularity
matrix.

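Noting that all $\lambda_l$ in Eq. (8) are now positive, we can rewrite the equation as

$$Q = \frac{1}{2m} \sum_{s=1}^{k} \sum_{l=1}^{p} \Bigl[ \sum_{i=1}^{n} \sqrt{\lambda_l}\, U_{il}\, \delta_{s,g_i} \Bigr]^2. \tag{9}$$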


Note that, because every row of the modularity matrix sums to zero, the uniform vector
$\mathbf{1} = (1, 1, 1, \ldots)$ is an eigenvector of the modularity matrix with
eigenvalue zero, a result that will be important shortly.
We define a set of $n$ $p$-dimensional vertex vectors $\mathbf{r}_i$ with elements

$$[\mathbf{r}_i]_l = \sqrt{\lambda_l}\, U_{il}, \tag{10}$$

in terms of which the modularity is

$$Q = \frac{1}{2m} \sum_{s=1}^{k} \Bigl| \sum_{i \in s} \mathbf{r}_i \Bigr|^2, \tag{11}$$

where the notation $i \in s$ denotes that vertex $i$ is in group $s$.

In other words, we assign to each vertex a vector $\mathbf{r}_i$, which can be
calculated solely from the structure of the network (since it is expressed in terms of
the eigenvalues and eigenvectors of the modularity matrix) and hence is constant
throughout the optimization procedure. Then the modularity of a division of the network
into groups is given (apart from the leading constant $1/2m$) as a sum of contributions,
one from each group $s$, equal to the squared magnitude of the sum of the vectors for
the vertices in that group. Our goal is to find the division that maximizes this
modularity.
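The construction of the vertex vectors and the evaluation of the rank-$p$ modularity can be sketched as follows. This is a minimal illustration under stated assumptions: it uses a dense eigendecomposition (`numpy.linalg.eigh`) for brevity, whereas a sparse Lanczos-type solver would be the natural choice for large networks, and the function names `vertex_vectors` and `approx_modularity` are hypothetical.

```python
import numpy as np


def vertex_vectors(A, p):
    """Rows are the vertex vectors r_i built from the leading p eigenpairs of B."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    B = A - np.outer(d, d) / d.sum()          # modularity matrix
    lam, U = np.linalg.eigh(B)                # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:p]         # indices of the p largest
    lam, U = lam[order], U[:, order]
    if np.any(lam <= 1e-12):
        raise ValueError("p exceeds the number of positive eigenvalues")
    return U * np.sqrt(lam)                   # [r_i]_l = sqrt(lambda_l) U_il


def approx_modularity(R, labels, two_m):
    """Rank-p modularity: (1/2m) * sum over groups of |sum of r_i in the group|^2."""
    labels = np.asarray(labels)
    total = 0.0
    for s in np.unique(labels):
        group_sum = R[labels == s].sum(axis=0)
        total += group_sum @ group_sum
    return total / two_m
```

Because only terms with positive eigenvalues are retained, keeping all positive eigenvalues gives a value no smaller than the exact modularity of the same division; with fewer eigenvectors the value is simply the objective that the partitioning heuristic works with.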

Generically, problems of this kind are called max-sum vector partitioning problems, or
just vector partitioning for short. In the following section we propose a heuristic
algorithm to rapidly solve vector partitioning problems and show how it can be applied
to perform efficient multiway spectral community detection in arbitrary networks.

We have not yet said what the value should be of the constant $p$ that specifies the
rank at which we approximate the modularity matrix in Eq. (8). We have said that $p$
should be no greater than the number of positive eigenvalues of the modularity matrix.
On the other hand, as shown in [14], if $p$ is less than $k - 1$ then the division of
the network with maximum modularity always has fewer than $k$ communities, since there
will be at least one pair of communities whose amalgamation into a single community will
increase the modularity. Thus $p$ should be greater than or equal to $k - 1$. In all of
the calculations presented in this paper we make the minimal choice $p = k - 1$, which
gives the fastest algorithm and in most cases gives excellent results. However, it is
worth bearing in mind that larger values of $p$ are possible and, in principle, give
more accurate approximations to the true value of the modularity.
III. VECTOR PARTITIONING ALGORITHM 


Vector partitioning is computationally easier than many optimization tasks. In
particular, it is solvable in polynomial, rather than exponential, time. A general
$k$-way partitioning of $n$ different $p$-dimensional vectors can be solved exactly in
time $O(n^{p(k-1)-1})$ [22]. Thus if we use the leading two eigenvectors of the
modularity matrix to divide a network into two communities the calculation can be done
in time $O(n)$, as shown previously in [14]. However, the running time quickly becomes
less tractable for larger numbers of communities. As discussed above, for a division of
a network into $k$ communities we must use at least $k - 1$ eigenvectors, which gives a
running time $O(n^{k^2 - 2k})$. Even for just three communities this gives $O(n^3)$,
which is practical only for rather small networks, and for four communities it gives
$O(n^8)$, which is entirely impractical. For applications to realistically large
networks with $k > 2$, therefore, we must abandon exact solution of the problem and look
for faster approximate methods.
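For reference, the exponents quoted in this paragraph follow from the general exact running time $O(n^{p(k-1)-1})$ by direct substitution, as the short worked check below shows.

```latex
\begin{align*}
p = 2,\ k = 2: &\quad p(k-1) - 1 = 2 \cdot 1 - 1 = 1 &&\Rightarrow\ O(n), \\
p = k - 1:     &\quad p(k-1) - 1 = (k-1)^2 - 1 = k^2 - 2k, \\
k = 3:         &\quad k^2 - 2k = 9 - 6 = 3 &&\Rightarrow\ O(n^3), \\
k = 4:         &\quad k^2 - 2k = 16 - 8 = 8 &&\Rightarrow\ O(n^8).
\end{align*}
```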

Previous approaches to vector partitioning include that of Wang et al. [23], who suggest
dividing the space of vectors into octants (or their generalization in higher
dimensions) and looking through all $2^{k-1}$ of them to find the $k$ octants that
contain the largest numbers of vectors. These are then used as an initial coarse
division and the remaining vectors are assigned to these groups by brute-force
optimization. This method works reasonably well for small values of $k$ but is not ideal
as $k$ becomes larger because the number of octants increases exponentially with $k$.
Richardson et al. [14] proposed a divide-and-conquer method that works by splitting the
space into octants again, but then splitting these into smaller wedges, and repeating
until further subdivision gives no improvement. This method works well for the $k = 3$
case with two eigenvectors but does not generalize well to higher $k$. Alpert and
Yao [21] proposed a greedy algorithm that works for any value of $k$ by adding vectors
one by one to the set to be partitioned, with vectors of larger magnitude being added
first (on the grounds that these contribute most to the sums in Eq. (11)). This method
works well when the largest magnitude vectors are distributed evenly among the final
groups, but more poorly when they are concentrated in a few groups. Unfortunately, as we
show in Section IV A, when network communities are of unequal sizes the largest vertex
vectors do indeed tend to be concentrated in a few groups and the method of [21] works
less well.

Here we introduce an alternative and well-motivated heuristic for finding the solution
to vector partitioning problems for general values of $k$. The algorithm is analogous to
the k-means algorithm for the standard data partitioning problem. The k-means method is
an algorithm for partitioning a set of data points in any number of dimensions into $k$
clusters in which we start by choosing $k$ index locations or centroids in the space.
These could be chosen in several ways: entirely at random, at random from among the set
of data points, or (most commonly) as the centroids of some initial approximate
partition of the data. Once these are chosen, we compute the distance from each data
point to each of the $k$ centroids and divide the data points into $k$ groups according
to which they are closest to. Then we compute the $k$ centroids of these new groups,
replace the old centroids with the new ones, and repeat. The process continues until the
centroids stop changing.

Our algorithm adopts a similar idea for vector partitioning, with points being replaced
by vectors and distances replaced by vector inner products. We start by choosing an
initial set of $k$ group vectors $\mathbf{R}_s$, one for each group or community $s$,
then we assign each of our vertex vectors $\mathbf{r}_i$ to one of the groups according
to which group vector it is closest to, in a sense we will define in a moment. Then we
calculate new group vectors for each community from these assignments and repeat. The
new group vectors are calculated simply as the sums of the vertex vectors in each group:

$$\mathbf{R}_s = \sum_{i \in s} \mathbf{r}_i, \tag{12}$$


so that the modularity, Eq. (11), is equal to

$$Q = \frac{1}{2m} \sum_s |\mathbf{R}_s|^2. \tag{13}$$


We observe the following property of this modularity. Suppose we move a vertex $i$ from
one community $s$ to another $t$. Let $\mathbf{R}_s$ and $\mathbf{R}_t$ represent the
group vectors of the two communities excluding the contribution from vertex $i$. Then,
before the move, the group vectors of the communities are $\mathbf{R}_s + \mathbf{r}_i$
and $\mathbf{R}_t$, and after the move they are $\mathbf{R}_s$ and
$\mathbf{R}_t + \mathbf{r}_i$. All other communities remain unchanged in the meantime
and hence the change $\Delta Q$ in the modularity upon moving vertex $i$ is

$$\Delta Q = \frac{1}{2m} \Bigl[ |\mathbf{R}_s|^2 + |\mathbf{R}_t + \mathbf{r}_i|^2 - |\mathbf{R}_s + \mathbf{r}_i|^2 - |\mathbf{R}_t|^2 \Bigr] = \frac{1}{m} \bigl[ \mathbf{R}_t^T \mathbf{r}_i - \mathbf{R}_s^T \mathbf{r}_i \bigr]. \tag{14}$$

Thus the modularity will either increase or decrease depending on which is the larger of
the two inner products $\mathbf{R}_t^T \mathbf{r}_i$ and $\mathbf{R}_s^T \mathbf{r}_i$.
Or, to put that another way, in order to maximize the modularity we should assign vertex
$i$ to the community whose group vector has the largest inner product with
$\mathbf{r}_i$.
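The identity for the change in modularity can be checked numerically. The sketch below compares the closed-form inner-product expression with a direct before-and-after evaluation of the sum of squared group vectors; the helper names `delta_q` and `q_from_groups` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def delta_q(R_s, R_t, r_i, m):
    """Change in modularity when vertex i moves from group s to group t.

    R_s and R_t are the group vectors EXCLUDING the contribution of r_i.
    """
    return (R_t @ r_i - R_s @ r_i) / m


def q_from_groups(group_vectors, m):
    """Modularity as (1/2m) times the sum of squared group-vector magnitudes."""
    return sum(R @ R for R in group_vectors) / (2 * m)
```

The cross terms $|\mathbf{r}_i|^2$ cancel between the two groups, which is why only the difference of inner products survives.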

This then defines our equivalent of “distance” for our k-means style vector partitioning
algorithm. Given a set of group vectors $\mathbf{R}_s$, we calculate the inner product
$\mathbf{R}_s^T \mathbf{r}_i$ between $\mathbf{r}_i$ and every group vector and then
assign vertex $i$ to the community with the highest inner product.

Note, however, that the group vectors $\mathbf{R}_s$ and $\mathbf{R}_t$ appearing in
Eq. (14) are defined excluding $\mathbf{r}_i$ itself. To be correct, therefore, we
should do the same thing in our partitioning algorithm. For every vertex vector
$\mathbf{r}_i$ there will be one group vector $\mathbf{R}_s$ that contains that vertex
vector (in the sense of Eq. (12)), and before calculating the inner product for that
group we should subtract $\mathbf{r}_i$ from the group vector. In practice this
subtraction typically makes little difference when the network is large, since the
subtraction or not of a single vertex from a large group is not going to change the
results much. In many cases, therefore, one can omit the subtraction step. On the other
hand, the algorithm is not significantly slower with the subtraction, so one could also
argue for its inclusion, purely on grounds of correctness. We do include it in the
calculations of this paper, but in the end it makes little difference to the results.

Our complete vector partitioning algorithm is the following:

1. Choose an initial set of group vectors $\mathbf{R}_s$, one for each of the $k$
communities.

2. Compute the inner product $\mathbf{R}_s^T \mathbf{r}_i$ for all vertices $i$ and all
communities $s$, or $(\mathbf{R}_s - \mathbf{r}_i)^T \mathbf{r}_i$ if vertex $i$ is
currently assigned to group $s$.

3. Assign each vertex to the community with which it has the highest (most positive)
inner product.

4. Update the group vectors using the definition of Eq. (12).

5. Repeat from step 2 until the group vectors stop changing. (One could also halt when
the changes become negligible or after some maximum number of iterations, just as some
k-means implementations also do.)

See Fig. 1 for an illustration of the working of the algorithm.
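The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `vector_partition` is hypothetical, all vertices are reassigned simultaneously in each sweep, no subtraction is applied in the first sweep (when no vertex has yet been assigned), and a `max_iter` cap guards against the possibility that simultaneous updates fail to settle. The initial group vectors are drawn from the vertex vectors themselves, with the last one set to minus the sum of the others.

```python
import numpy as np


def vector_partition(r, k, rng=None, max_iter=100):
    """k-means-style heuristic for max-sum vector partitioning.

    r : (n, p) array whose rows are the vertex vectors r_i
    k : number of groups
    Returns an integer array of n group labels in 0..k-1.
    """
    r = np.asarray(r, dtype=float)
    rng = np.random.default_rng(rng)
    n = len(r)

    # Step 1: k-1 initial group vectors picked from the vertex vectors,
    # the last fixed so that the group vectors sum to zero.
    R = r[rng.choice(n, size=k - 1, replace=False)]
    R = np.vstack([R, -R.sum(axis=0)])

    labels = np.full(n, -1)                   # -1 means "not yet assigned"
    for _ in range(max_iter):
        # Step 2: inner products R_s^T r_i for every vertex and group ...
        scores = r @ R.T
        # ... using (R_s - r_i)^T r_i for the group the vertex is in.
        idx = np.flatnonzero(labels >= 0)
        scores[idx, labels[idx]] -= np.einsum("ij,ij->i", r[idx], r[idx])

        # Step 3: assign each vertex to its highest-scoring group.
        new_labels = scores.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break                             # Step 5: nothing changed

        # Step 4: group vectors are the sums of their members' vectors.
        labels = new_labels
        R = np.zeros((k, r.shape[1]))
        np.add.at(R, labels, r)
    return labels
```

Stopping when the labels stop changing is equivalent to stopping when the group vectors stop changing, since the group vectors are determined by the labels.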

We still need to decide how the initial group vectors should be chosen. In the simplest
case we might just choose them to be of equal magnitude and point in random directions.
However, if there is community structure in the network then we expect the vertex
vectors to be clustered, pointing in a small number of directions, with no or few
vectors pointing in the remaining directions. It makes little sense to pick initial
group vectors pointing in directions well away from where the clusters lie, so in
practice we have found that, rather than giving the group vectors random directions, we
can get good results by picking them randomly from among the vertex vectors themselves.
This ensures that, if most vectors point in a few directions, we will be likely to
choose initial group vectors that also point in those directions.

Note in fact that we need only pick $k - 1$ of the $k$ group vectors in this fashion,
the final vector being fixed by the fact that the group vectors sum to zero. To see
this, recall that the uniform vector $\mathbf{1} = (1, 1, 1, \ldots)$ is always an
eigenvector of the modularity matrix, which implies that the elements of all other
eigenvectors (i.e., the other columns of the orthogonal matrix $\mathbf{U}$) must sum to
zero, since they must be orthogonal to the uniform vector. Then the definition of
Eq. (10) implies that

$$\sum_{i=1}^{n} [\mathbf{r}_i]_l = \sqrt{\lambda_l} \sum_{i=1}^{n} U_{il} = 0, \tag{15}$$

and hence

$$\sum_{i=1}^{n} \mathbf{r}_i = 0, \tag{16}$$
FIG. 1: Depiction of the operation of our vector partitioning heuristic for, in this
case, a set of two-dimensional vectors being divided into three groups. The blue lines
and dots denote the individual vectors. The red lines are the group vectors. (The
magnitudes of the group vectors have been rescaled to fit into the figure; normally they
would be much larger, since they are the sums of the individual vectors in each group.)
The dashed lines indicate the borders between communities, which are determined both by
the angles and relative magnitudes of the group vectors. For example, the vector labeled
$\mathbf{r}_i$ will be assigned to group 1 in this case, because it has its largest
inner product with $\mathbf{R}_1$.


and

$$\sum_s \mathbf{R}_s = \sum_s \sum_{i \in s} \mathbf{r}_i = \sum_{i=1}^{n} \mathbf{r}_i = 0. \tag{17}$$

Thus, once we have chosen $k - 1$ of the group vectors randomly, the final one is fixed
to be equal to minus the sum of the rest.

Since there is a random element in the initialization of our algorithm, its result is
not always guaranteed to be the same, even when applied to the same network with the
same parameter values; it may give different results for the modularity on different
runs. In applications, therefore, we typically do several runs of the algorithm with
different initial conditions, choosing from among the results the community division
that gives the highest value of the modularity.


IV. APPLICATIONS 

In this section we give example applications of our 
method, first to computer-generated test networks and 
then to two real-world examples. 


A. Synthetic networks 

For our first tests of the method we look at a set of computer-generated (“synthetic”)
benchmark networks that contain known community structure. Our goal is to see whether,
and how accurately, the algorithm can recover that structure. In our tests we make use
of networks generated using the degree-corrected stochastic block model [24]. The
stochastic block model (not degree-corrected) is a generative model of
community-structured networks whose origins go back to the 1980s [25, 26]. Vertices are
divided into groups and edges are placed between pairs independently at random with
probabilities $\omega_{st}$ that depend only on the groups $s, t$ that the vertices
belong to. If the diagonal probabilities $\omega_{ss}$ are larger than the off-diagonal
ones, then the network will display classic “assortative” community structure with more
connections within groups than between them. The stochastic block model is unrealistic,
however, in generating a Poisson distribution of vertex degrees, which is quite
different from the highly right-skewed distributions commonly seen in real networks. The
degree-corrected block model remedies this problem by fixing the (expected) degrees of
the vertices at any values we choose. In this model edges are placed independently
between pairs of vertices $i, j$ with probability $d_i d_j \omega_{st}$, where $d_i$ is
the desired degree of vertex $i$. For a detailed discussion see [24].

Our tests consist of generating a number of networks using the degree-corrected block
model, analyzing them using our algorithm, then comparing the communities found with
those planted in the networks in the first place. To quantify the similarity of the two
sets of communities, planted and detected, we make use of a standard measure, the
normalized mutual information or NMI [27, 28]. The (unnormalized) mutual information of
two sets $X$, $Y$ of numbers or measurements is defined to be

$$I(X; Y) = \sum_{x \in X} \sum_{y \in Y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}, \tag{18}$$

where $p(x, y)$ is the joint probability or frequency of $x$ and $y$ within the data set
and $p(x)$, $p(y)$ are their marginal probabilities. The mutual information measures how
much you learn about one of the two sets of measurements by knowing the other. If $X$
and $Y$ are uncorrelated then each tells you nothing about the other and the mutual
information is zero. If they are perfectly correlated then each tells you everything
about the other and the mutual information takes its maximum value, which is equal to
all of the information that either set contains, which is simply the entropy, $H(X)$ or
$H(Y)$, of the set.

Having the maximum value of the mutual information 
be equal to the entropy is in some ways inconvenient, 
since we don’t know in advance what that value will be. 
So commonly one normalizes the mutual information by 
FIG. 2: Normalized mutual information as a function of the parameter $\delta$ for communities detected in randomly generated
test networks using the vector partitioning algorithm of this paper (red squares) and the k-means method of Ref. [16] (blue
triangles). The networks consist of n = 3600 vertices each, divided into three communities thus: (a) equally sized communities
of 1200 vertices each; (b) communities of sizes 1800, 1200, and 600; (c) communities of sizes 2400, 900, and 300. Each data
point is an average of 100 networks. The vertical dashed line in panel (a) indicates the position of the detectability threshold
below which all methods must fail [29].


dividing by the mean of the entropies of the two sets, thus:

$$\mathrm{NMI}(X; Y) = \frac{I(X; Y)}{\tfrac{1}{2} [H(X) + H(Y)]}. \tag{19}$$

This normalized value falls in the interval from zero to one, with uncorrelated
variables giving zero and perfect correlation giving one.
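The normalized mutual information can be computed from two label sequences with nothing more than frequency counts. The sketch below is a minimal pure-Python illustration; the function names are hypothetical, natural logarithms are used (the base cancels in the ratio), and returning 0 when both entropies vanish is a convention assumed here rather than something specified in the text.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H of a sequence of labels."""
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """Mutual information I(X;Y) from empirical joint and marginal frequencies."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # p(x,y) / (p(x) p(y)) = c * n / (count_x * count_y)
    return sum(c / n * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def nmi(x, y):
    """Mutual information divided by the mean of the two entropies."""
    denom = 0.5 * (entropy(x) + entropy(y))
    return mutual_information(x, y) / denom if denom > 0 else 0.0
```

Relabeling the groups leaves the result unchanged, so `nmi([0, 0, 1, 1], [1, 1, 0, 0])` equals one, while two statistically independent labelings such as `[0, 0, 1, 1]` and `[0, 1, 0, 1]` give zero.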

The NMI is commonly used to quantify the match between two clusterings of the vertices
of a network. In the present case, the original assignments of vertices to groups in the
block model (the “planted communities”) are used as one set of measurements $X$ and the
assignments found by our algorithm (the “inferred communities”) are the other $Y$. An
NMI of 1 denotes perfect recovery of the planted partition; an NMI of 0 indicates
complete failure.

In the tests presented here we use networks of $n = 3600$ vertices divided into $k = 3$
communities and with two different (expected) degrees: half the vertices in each group
have degree 10 and the other half have degree 30. The parameters $\omega_{st}$ are
varied in order to tune the difficulty of the community detection according to

$$\omega_{st} = (1 - \delta)\, \omega_{st}^{\mathrm{random}} + \delta\, \omega_{st}^{\mathrm{planted}}, \tag{20}$$

where $\delta$ is a parameter that varies from zero to one and

$$\omega_{st}^{\mathrm{random}} = \frac{1}{2m}, \qquad \omega_{st}^{\mathrm{planted}} = \frac{\delta_{st}}{\sum_{i \in s} d_i}, \tag{21}$$

with $m$ being the total number of edges in the network, as previously. With this
choice, the parameter $\delta$ tunes the edge probabilities from a value of
$d_i d_j / 2m$ when $\delta = 0$, which corresponds to a purely random edge distribution
with no community structure at all (the so-called configuration model [30-32]), to a
value of $d_i d_j / \sum_{l \in s} d_l$ within each group $s$ and zero between groups
when $\delta = 1$. The latter is effectively three separate, unconnected configuration
models, one for each group, which is the strongest form of community structure one could
have. This choice of $\omega_{st}$ also has the nice property that the expected fraction
of within-group edges that a vertex has is the same for all vertices.
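The interpolation between the random and planted limits can be made concrete with a short sketch. It assumes the planted term is the Kronecker delta divided by the sum of the desired degrees in the group, consistent with the limiting within-group edge probability quoted above; the function name `omega_matrix` and the argument `group_degree_sums` are hypothetical.

```python
def omega_matrix(delta, group_degree_sums):
    """Mixing matrix omega_st = (1 - delta) * random + delta * planted.

    group_degree_sums[s] is the sum of desired degrees d_i over vertices in
    group s (assumed form of the planted term: delta_st / that sum).
    """
    two_m = sum(group_degree_sums)       # the degrees sum to 2m
    k = len(group_degree_sums)
    return [
        [
            (1 - delta) / two_m
            + (delta / group_degree_sums[s] if s == t else 0.0)
            for t in range(k)
        ]
        for s in range(k)
    ]
```

With edge probability $d_i d_j \omega_{st}$, the expected degree of a vertex in group $s$ is $d_i \sum_t \omega_{st} D_t$, where $D_t$ is the degree sum of group $t$. Under this form of $\omega_{st}$ that sum equals one for every $\delta$, so the desired degrees are preserved as the strength of the community structure is tuned.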

We have tested our algorithm on these networks using two eigenvectors to define the
vertex vectors (the minimum viable number). The results are shown as a function of the
parameter $\delta$ in Fig. 2, along with results for the same networks analyzed by
clustering the vertex vectors using the k-means algorithm of Ref. [16].

As $\delta \to 1$ the community structure in the network becomes strong and any
reasonable algorithm should be able to detect it. As we approach this limit our
algorithm assigns 100% of vertices to their correct communities and the NMI approaches
one. Conversely, as $\delta \to 0$ the community structure in the network vanishes and
neither algorithm should detect anything, so the NMI approaches zero. Furthermore, it is
known that there is a critical strength of the structure (which translates to a critical
value of our parameter $\delta$) below which the structure is so weak that no algorithm
can detect it [29]. This “detectability threshold” is marked in Fig. 2 with a vertical
dashed line. Above this point it should be possible to detect the communities, albeit
with a certain error rate, and indeed we see that both algorithms achieve a nonzero NMI
in this region.

As the figure shows, the vector partitioning algorithm does as well as or better than
k-means in almost all cases. In panel (a) the three communities in the network have
equal sizes, and in this case the two algorithms perform comparably, there being only a
small range of parameter values in the middle of the plot where vector partitioning
FIG. 3: Illustration of the division of a synthetic three-group network using (a) the
vector partitioning algorithm of this paper and (b) the k-means algorithm of [16].
Shapes indicate the planted communities while colors indicate the communities found by
the two algorithms. Observe how the k-means results assign a good portion of vertices
belonging to the red and green communities incorrectly to the blue one, while the vector
partitioning approach does not have this problem. The network in this case has n = 4000
vertices with communities of size 3000, 500, and 500.



FIG. 4: Four-way division into communities of a collaboration 
network of scientists at the Santa Fe Institute. Different colors 
and shapes indicate the communities discovered by the vector 
partitioning algorithm of this paper. The communities split 
roughly along lines of research topic. 


outperforms k-means by a narrow margin. In panels (b) and (c) the communities have
unequal sizes (moderately so in (b) and highly in (c)) and in these cases vector
partitioning does significantly better than k-means. Indeed for unequal group sizes the
k-means algorithm fails to achieve perfect community classification (NMI = 1) even in
the limit where $\delta = 1$. The reason for this is illustrated in Fig. 3, which shows
a scatter plot of the vertex vectors for an illustrative example network along with the
communities into which each algorithm divides the vertices (shown by the colors). As the
figure shows, when the groups are unequal in size the largest group is closer to the
origin than the smaller ones, necessarily so since the centroid of the vertex vectors
lies at the origin (Eq. (16)). This tends to throw off the k-means algorithm, which by
definition splits the points into groups of roughly equal spatial extent. The vector
partitioning method, which is (correctly) sensitive only to the direction and not the
magnitude of the vertex vectors, has no such problems.


B. Real-world examples 

Our next two example applications are to real-world networks, two collaboration networks
among scientists. The first, taken from Ref. [33], represents scientists working at the
Santa Fe Institute, an interdisciplinary research institute in New Mexico. The vertices
in the network represent the scientists and the edges indicate that two scientists
coauthored a paper together at least once. The network is small enough to allow
straightforward visualization of our results and is interesting in that the scientists
it represents, in keeping with the interdisciplinary mission of the institute, come from
a range of different research fields, in this case statistical physics, mathematical
ecology, RNA structure, and agent-based modeling. It is plausible that the communities
in the network might reflect these subject areas.

Figure 4 shows the result of a four-way community
division of this network using vertex vectors constructed
from the first three eigenvectors of the modularity
matrix. Overall the results mirror our expectations, with
the four subject areas corresponding roughly to the four
communities found by the method. We note, however,
that there are also four vertices in the middle-right of the
figure that are clearly misclassified as being in the
“agent-based models” group when they would be more
plausibly placed in the “structure of RNA” group. This
illustrates a potential weakness of the algorithm: the
defining feature of these vertices is that their vertex
vectors have very small magnitude, meaning that they do
not strongly belong to any group. For such vertices even a
small error, such as that introduced by making our
low-rank approximation to the true modularity matrix, can
alter the direction of the vertex vector substantially and
hence move a vertex to a different group. Problems like
this are, in fact, common to many spectral algorithms
and are typically handled by combining the algorithm
with a subsequent iterative refinement or “fine tuning”
step, in which individual vertices or small sets of vertices
are moved from group to group in an effort to improve the
value of the modularity [1, 15]. The spectral algorithm
is good at determining the “big picture,” rapidly doing an
overall division of the network into broad groups of
vertices; the subsequent fine tuning tidies up the remaining
details. Based on the results we see here, our algorithm
might be a good candidate for combination with a fine
tuning step of this kind.
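
A fine tuning step of this general kind can be sketched as follows. This is one simple greedy variant written for illustration, and the refinement schemes cited above may differ from it in detail. It assumes the standard modularity Q = (1/2m) sum_ij B_ij delta(g_i, g_j); the function names modularity and fine_tune, and the small test graph of two triangles joined by one edge, are hypothetical.

```python
import numpy as np

def modularity(A, groups):
    """Q = (1/2m) * sum_ij [A_ij - d_i d_j / 2m] * delta(g_i, g_j)."""
    groups = np.asarray(groups)
    d = A.sum(axis=1)
    two_m = d.sum()
    B = A - np.outer(d, d) / two_m
    same = groups[:, None] == groups[None, :]
    return float((B * same).sum() / two_m)

def fine_tune(A, groups, k, max_sweeps=100):
    """Greedy single-vertex moves; stops when a full sweep gives no improvement."""
    groups = np.array(groups, dtype=int)  # work on a copy
    best_q = modularity(A, groups)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(groups)):
            original = groups[i]
            best_group = original
            for s in range(k):
                if s == original:
                    continue
                groups[i] = s
                q = modularity(A, groups)  # recomputed in full for clarity, not speed
                if q > best_q + 1e-12:
                    best_q, best_group = q, s
            groups[i] = best_group
            if best_group != original:
                improved = True
        if not improved:
            break
    return groups, best_q

# Two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

start = np.array([0, 0, 1, 1, 1, 1])  # vertex 2 deliberately misplaced
q_before = modularity(A, start)
tuned, q_after = fine_tune(A, start, k=2)
print(q_before, tuned, q_after)
```

Recomputing the modularity from scratch for every trial move keeps the sketch short but costs O(n^2) per move; a practical implementation would instead update Q incrementally from the change caused by moving a single vertex.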

FIG. 5: The 21 communities found in a collaboration network of network scientists using the algorithm proposed in this paper.

Our second real-world example is a collaboration network
of scientists working in the field of network science
itself and is taken from Ref. [14]. Apart from being rather
larger than the Santa Fe Institute network, at 379
scientists, this network also differs in that all its members
are, ostensibly at least, studying the same subject, so there
is no obvious “ground truth” for the communities as there
was in the previous example, or even for how many
communities there should be. Choosing the number of
communities into which a network should be divided is a deep
problem in its own right, and one that is not completely
solved. Here, however, we simply borrow a technique
from the literature and estimate the number of
communities in the network by counting the real eigenvalues
of the so-called non-backtracking matrix that are greater
than the largest real part among the complex eigenvalues.
(For a discussion of why this is a good heuristic,
see [34].) In the present case this suggests that there
should be 26 communities in the network, so we choose
k = 26 for our community detection algorithm and
construct the vertex vectors from the leading 25 eigenvectors
of the modularity matrix. The results are shown in Fig. 5.
In fact, in this case we find that the algorithm does not
make use of all 26 communities; the figure contains only
21. Nonetheless, the algorithm has succeeded in finding
a good division in terms of modularity: the modularity
value is Q = 0.83, comparable to the value given, for
example, in [15] for the same network. We note, however,
that, as is typical for larger values of k, the algorithm
finds a range of different divisions of the network in
different runs that all have competitive modularity. The
existence of competing good community divisions in the
same network is a well-known phenomenon and has been
previously discussed, for instance, by Good et al. [35].
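
The eigenvalue-counting rule described above can be written down in a few lines. The following is a minimal sketch under stated assumptions, not the code used for the results in this section. It uses the standard definition of the non-backtracking matrix on directed edges, in which the entry for (i -> j), (k -> l) is 1 when j = k and l != i. The function names, the dense-matrix construction, and the tolerance used to decide whether an eigenvalue is real are all our own choices for illustration.

```python
import numpy as np

def non_backtracking_matrix(edges):
    """Dense 2m x 2m non-backtracking matrix of an undirected edge list."""
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    NB = np.zeros((len(directed), len(directed)))
    for a, (u, v) in enumerate(directed):
        for b, (w, x) in enumerate(directed):
            # Step from u->v to w->x is allowed if it continues at v
            # and does not immediately return to u.
            if v == w and x != u:
                NB[a, b] = 1.0
    return NB

def estimate_num_communities(edges, tol=1e-8):
    """Count real eigenvalues above the largest real part of the complex ones."""
    eig = np.linalg.eigvals(non_backtracking_matrix(edges))
    is_real = np.abs(eig.imag) < tol  # tolerance is an assumption
    complex_eigs = eig[~is_real]
    # If there are no complex eigenvalues, every real eigenvalue is counted.
    threshold = complex_eigs.real.max() if complex_eigs.size else -np.inf
    return int(np.sum(eig[is_real].real > threshold))

# Tiny example used only to exercise the code: the complete graph on 4 vertices.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
NB = non_backtracking_matrix(edges)
k_est = estimate_num_communities(edges)
print(NB.shape, k_est)
```

The heuristic is intended for large sparse networks, where the complex eigenvalues form a bulk that separates cleanly from the informative real ones, so the count returned for a four-vertex graph carries no meaning. For networks of realistic size a sparse matrix and an iterative eigensolver would be needed in place of the dense construction used here.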


V. CONCLUSIONS 

In this paper we have described a mapping of a
multiway spectral community detection method onto a
vector partitioning problem and proposed a simple heuristic
algorithm for vector partitioning that returns good
results in this application. We have tested our method on
computer-generated benchmark networks, comparing it
with a competing spectral algorithm that makes use of
k-means clustering, and find our method to give superior
performance, particularly in cases where the sizes of the
communities are unequal. We have also given two
example applications of our method to real-world networks.

There remain a number of open questions not answered
in this paper. Although the algorithm we propose is
simple and efficient, it is only approximate and we have no
formal results on its expected performance. The
algorithm also assumes we have prior knowledge of the
number of communities in the network, where in reality this
is not usually the case. Determining the number of
communities in a network is an interesting open problem.
Finally, as we (and others) have pointed out, the best
community detection methods are typically hybrids of
two or more elementary methods. It would be interesting
to see how the vector partitioning algorithm we
propose works in combination with other methods. These
problems, however, we leave for future work.